Bank Card Usage Prediction Exploiting Geolocation Information

Authors

Martin Wistuba

Nghia Duong-Trung

Nicolas Schilling

Lars Schmidt-Thieme

Abstract

We describe the solution of team ISMLL for both tasks of the ECML-PKDD 2016 Discovery Challenge on Bank Card Usage. Our solution rests on three pillars: gradient boosted decision trees as a strong regression and classification model, an intensive search for good hyperparameter configurations, and strong features that exploit geolocation information. This approach achieved the best performance on the public leaderboard for the first task and a decent fourth position for the second task.

arXiv:1610.03996v1 [cs.LG] 13 Oct 2016

1 Challenge Description

The goal of one of this year's ECML-PKDD Discovery Challenges was to predict the behaviour of customers of the Hungarian bank OTP Bank. The challenge was divided into two tasks. The first task was to predict, for a set of customers, the number of visits to every bank branch; the second task was to predict whether a customer will apply for a credit card in the next six months. For these tasks, anonymized customer information (e.g. age, location, income, gender) and bank activities (e.g. what has been bought, where and when) were provided. A labeled data set for 2014 was made available, which can be used for supervised machine learning to predict the targets for a disjoint set of customers for 2015.

The evaluation measure for Task 2 is the area under the ROC curve (AUC), a very common measure for imbalanced classification problems. The evaluation measure for Task 1 is a little more exotic. It is the average of cosine@1 and cosine@5 for every customer c, where

$$\text{cosine@}k := \frac{\sum_{i=1}^{k} y_{c,i}\,\hat{y}_{c,i}}{\sqrt{\sum_{i=1}^{b} y_{c,i}^{2}}\,\sqrt{\sum_{i=1}^{k} \hat{y}_{c,i}^{2}}} \tag{1}$$

with $y_{c,i}$ being the number of times customer c has visited bank branch i and $\hat{y}_{c,i}$ the corresponding prediction. There are b different branches in total. For more information we refer to the challenge website [1].

2 Problem Identification

For the first task, we assumed that there is no relation between a customer's numbers of visits to different branches.
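As a concrete reading of the Task 1 measure in Equation (1), the following minimal sketch computes cosine@k and the averaged score for a single customer. It assumes that the sums up to k run over the k branches with the highest predicted counts, and that a customer with an all-zero vector scores zero; neither detail is stated in the text, and the function names are illustrative.

```python
import math


def cosine_at_k(y_true, y_pred, k):
    """cosine@k for one customer; both lists hold one value per branch."""
    # Assumption: the k submitted branches are those with the largest predictions.
    top = sorted(range(len(y_pred)), key=lambda i: y_pred[i], reverse=True)[:k]
    numerator = sum(y_true[i] * y_pred[i] for i in top)
    # The norm of the true visits runs over all b branches.
    true_norm = math.sqrt(sum(v * v for v in y_true))
    # The norm of the predictions runs over the k selected branches only.
    pred_norm = math.sqrt(sum(y_pred[i] ** 2 for i in top))
    if true_norm == 0 or pred_norm == 0:
        return 0.0  # assumption: undefined cases are scored as zero
    return numerator / (true_norm * pred_norm)


def task1_score(y_true, y_pred):
    """Average of cosine@1 and cosine@5 for one customer."""
    return 0.5 * (cosine_at_k(y_true, y_pred, 1) + cosine_at_k(y_true, y_pred, 5))
```

For a customer with true visits `[3, 0, 4, 0, 0, 0]` and a perfect prediction, cosine@1 is 16 / (5 · 4) = 0.8 because only the top branch enters the numerator, cosine@5 is 1, and the score is 0.9. Under this reading, even a perfect predictor loses score at k = 1 when visits are spread over several branches.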
This independence assumption let us treat the problem as b separate regression tasks, one per branch. For each branch, we independently trained a regression model that predicts how often a customer will visit that branch, based on past information for that branch. This is a classical example of count data; hence, we tackled the task as a Poisson regression problem. For Task 1 we had to select five bank branches for which to make predictions. We simply chose the five with the highest predicted number of visits, which is the best way to achieve a good score provided the predictor performs reasonably well. We considered Task 2 to be a classification task. We minimized the logistic loss and accounted for the class imbalance by choosing an appropriate class weight. For both tasks, we used gradient boosted decision trees [2] as the prediction model.

3 Data Preprocessing

For feature and hyperparameter selection, we had to split the labeled data set into a training data set $D_{\text{train}}$ and a validation data set $D_{\text{valid}}$ such that the performance on $D_{\text{valid}}$ reflects the performance on the hidden test data. The task was to infer, from some customers and their activities in 2014, the behaviour of a disjoint set of customers in 2015. For the test customers, only basic customer information and their activities in the first half of 2015 (excluding branch visits) were given. Thus, we decided to split the given labeled data set by customers, selecting 80% for $D_{\text{train}}$ and the remaining 20% for $D_{\text{valid}}$ uniformly at random. For validation purposes, only the first six months of activities of the validation customers (excluding branch visits) were provided to the model. The one caveat is that on this split we predict for 2014 customers from 2014 data, whereas the test customers are from 2015; we saw no way to overcome this. Very basic information about the customers was available, including age, location, income and gender.
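The customer-level split described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (dictionaries with `customer`, `month` and `channel` keys), the channel label `"branch"` and the function names are hypothetical and not taken from the challenge data.

```python
import random


def split_customers(customer_ids, valid_fraction=0.2, seed=0):
    """Split customers (not individual records) uniformly at random."""
    ids = sorted(set(customer_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_valid = round(len(ids) * valid_fraction)
    return set(ids[n_valid:]), set(ids[:n_valid])  # (train, valid)


def validation_view(activities, valid_ids, cutoff_month=6):
    """Mimic the test setting for validation customers.

    Keeps only activities from the first six months and drops branch
    visits, since those are what Task 1 has to predict.
    """
    return [
        a for a in activities
        if a["customer"] in valid_ids
        and a["month"] <= cutoff_month
        and a["channel"] != "branch"  # hypothetical channel label
    ]
```

Splitting by customer rather than by record keeps the two partitions disjoint in customers, which mirrors the disjoint train and test populations of the challenge.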
While gender is binary by nature, the other basic features (age, location and income) were already binned into three categories. We used this information as features after one-hot encoding. Furthermore, the bank's internal classification of whether a customer is considered wealthy was given for each month. We distinguished the following five categories of customers: those classified as 1) wealthy in all observed months, 2) not wealthy in all observed months, 3) first wealthy and then not wealthy, 4) first not wealthy and then wealthy, and 5) those whose classification changed more than once. We added this information as one-hot encoded features. Finally, we were given the months in which the customer possessed a credit card of the bank. Analogously to the five wealth categories, we created categories for the credit card time series.

Besides the basic customer features, we wanted to use the information on the customer's activities. While we found many features that improved the performance for Task 2 on our internal data split, many of them yielded no improvement on the public leaderboard. Thus, the only feature of this kind we used is the number of activities per channel performed by the customer. Figure 1 shows that it is one of the most predictive features.

[Figure 1: bar chart. Vertical axis: Relative Relevance (0.00 to 0.25). Horizontal axis: features, grouped as age, location, income, gender, credit card status, wealthy status, POS and webshop activities, activity location, activity time of day, and card usage (debit or credit card).]

Fig. 1. This plot visualizes the relative number of times a feature was chosen to build a tree in Task 2. The activity features are used in almost every fourth tree.

For Task 2 we considered location information about activities, bank branches and customers to be irrelevant and used only the aforementioned features. For Task 1, however, this information was among the most impactful. One feature we used was the distance between the customer's residence and a bank branch, which is a fairly obvious choice. Digging into the data, we saw that many customers used bank branches very far from their residence. We tried to capture this by also adding the maximum, minimum, mean and median distance between a bank branch and the customer's activities. Finally, we added k-nearest-neighbors predictions for $k = 2^0, 2^1, \ldots, 2^{10}$, using the Euclidean distance between customers' residences as the distance function. These features follow the simple assumption that customers who live near each other visit the same bank branches. Figure 2 provides insight into our intermediate feature selection experiments for Task 1 and clearly shows the importance of the location-aware features. Based on this experiment, we used all features except the credit card information for Task 1. Figure 3 shows the relative frequency with which each feature was chosen as a splitting variable. Again, this shows the importance of location-aware features for Task 1.

[Figure 2: bar chart. Vertical axis: Evaluation Measure (0.66 to 0.70). Horizontal axis: feature sets, namely all features and variants with individual feature groups removed (e.g. no kNN, no location, no residence/branch distance).]

Fig. 2. Intermediate feature backward selection results for Task 1. Location-aware features provide huge improvements.
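The location-aware features for Task 1 can be sketched for a single branch as shown below. This is a minimal illustration: coordinates are assumed to be planar `(x, y)` tuples, the k-NN prediction is assumed to be the mean visit count of the k nearest training customers (the text does not specify the aggregation), and all names are hypothetical.

```python
import math
from statistics import mean, median


def distance_features(branch_xy, residence_xy, activity_xys):
    """Distances from one branch to a customer's residence and activities.

    Assumes the customer has at least one located activity.
    """
    d = [math.dist(branch_xy, a) for a in activity_xys]
    return {
        "residence": math.dist(branch_xy, residence_xy),
        "act_min": min(d),
        "act_max": max(d),
        "act_mean": mean(d),
        "act_median": median(d),
    }


def knn_features(query_xy, train_xy, train_visits,
                 ks=tuple(2 ** p for p in range(11))):
    """One k-NN prediction per k = 1, 2, 4, ..., 1024 for a single branch.

    train_visits[j] is how often training customer j visited the branch.
    """
    # Rank training customers by Euclidean distance between residences.
    order = sorted(range(len(train_xy)),
                   key=lambda j: math.dist(query_xy, train_xy[j]))
    feats = {}
    for k in ks:
        nearest = order[:k]  # uses all customers if fewer than k exist
        feats[f"{k}-NN"] = mean(train_visits[j] for j in nearest)
    return feats
```

Small values of k capture very local habits, while large values approach the overall popularity of the branch in the region, which may be why a whole range of k was supplied to the tree model rather than a single value.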
[Figure 3: bar chart. Vertical axis: Relative Relevance (0.000 to 0.100). Horizontal axis: features, grouped as age, location, income, gender, wealthy status, POS, webshop and all activities, residence/branch distance, activity/branch distance (min, max, mean, median), and k-NN predictions for k = 1, 2, 4, ..., 1024.]

Fig. 3. This plot visualizes the relative relevance of all features used in Task 1. The higher the score, the more often the feature was used for building a tree. Location-aware features prove to be highly predictive.

4 Hyperparameter Tuning and Ensembling

For both tasks we tuned the hyperparameters by treating the choice of hyperparameters λ as a black-box optimization problem

$$\operatorname*{arg\,min}_{\lambda}\; \mathcal{L}\left(\hat{y}_{\lambda}(D_{\text{valid}}),\, y_{\text{valid}}\right) \tag{2}$$

where $\hat{y}_{\lambda}$ is the model trained on the training partition $D_{\text{train}}$ using hyperparameter configuration λ, and $\hat{y}_{\lambda}(D_{\text{valid}})$ are the corresponding predictions for the validation partition $D_{\text{valid}}$. The problem of hyperparameter tuning is then to find a hyperparameter configuration λ that minimizes a loss function $\mathcal{L}$ given the predictions and the ground truth. We tackled this black-box optimization problem using Sequential Model-based Optimization (SMBO) [3]. Figure 4 presents the progress of the optimization process, which was conducted in parallel on 100 cores, on our own train/validation split as well as the results on the public leaderboard for Task 1.

For Task 2, we tried diverse ways of ensembling using different base models but did not achieve any improvement. In the end, we averaged the predictions of 100 models trained with the best hyperparameter configuration and different seeds.
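The black-box view of Equation (2) and the final seed averaging can be illustrated with the sketch below. It deliberately uses plain random search as a simple stand-in and is not the SMBO procedure used by the authors. The callables `train_and_score` and `fit_predict`, and the search-space format, are hypothetical placeholders for training a model and evaluating it on the validation partition.

```python
import random


def random_search(train_and_score, space, n_trials=50, seed=0):
    """Approximate the argmin over lambda in Equation (2) by random sampling.

    train_and_score(cfg) is assumed to train on D_train with configuration
    cfg and return the loss on D_valid (lower is better).
    """
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Draw every hyperparameter uniformly from its (low, high) range.
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        loss = train_and_score(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss


def average_over_seeds(fit_predict, cfg, seeds):
    """Average per-example predictions of models differing only in seed."""
    runs = [fit_predict(cfg, s) for s in seeds]
    return [sum(col) / len(runs) for col in zip(*runs)]
```

A model-based optimizer would replace the uniform sampling step with proposals guided by previous evaluations; the surrounding loop and the objective stay the same. Averaging over seeds leaves the configuration fixed and only reduces the variance caused by randomness in training.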

Similar resources

Detecting Suspicious Card Transactions in unlabeled data of bank Using Outlier Detection Techniqes

With the advancement of technology, the use of ATM and credit cards are increased. Cyber fraud and theft are the kinds of threat which result in using these Technologies. It is therefore inevitable to use fraud detection algorithms to prevent fraudulent use of bank cards. Credit card fraud can be thought of as a form of identity theft that consists of an unauthorized access to another person's ...

Exploiting Contextual Information for Fine-Grained Tweet Geolocation

Exploiting Natwest and RBS online banking systems for profit

The Natwest and Royal Bank of Scotland (RBS) online banking systems are vulnerable to a remote attack which allows an adversary to steal money from a customer’s account. The vulnerability has arisen as a result of poor software engineering practice which neglected security. More precisely, the authentication mechanisms used by Natwest and RBS are dependent on six pieces of customer data, namely...

Protecting against Credit Card Forgery with Existing Magnetic Card Readers

Existing magnetic cards adopt plain text to store confidential information, thus being vulnerable to an untrusted credit card reader or a skimming device. To tackle the problem, researchers have proposed many solutions such as integrated circuit card (IC card) and mobile wallets applications [6, 9, 12, 20, 21]; however, none of them can support existing magnetic card readers thereby facing sign...

Working Paper Series the Economics of Two-sided Payment Card Markets: Pricing, Adoption and Usage Wp 12-06 James Mcandrews Federal Reserve Bank of New York the Economics of Two-sided Payment Card Markets: Pricing, Adoption and Usage *

This paper provides a new theory for two-sided payment card markets. Adopting payment cards requires consumers and merchants to pay a fixed cost, but yields a lower marginal cost of making payments. Analyzing adoption and usage externalities among heterogeneous consumers and merchants, our theory derives the equilibrium card adoption and usage pattern consistent with empirical evidence. Our ana...

An analysis of mobile credit card usage intentions

Purpose – Many banks consider mobile-based technologies have improved the banking services through introduction of new banking facilities. One of the latest facilities developed in this area is the “mobile credit card.” The purpose of this study is to examine the factors that determine intention to use mobile credit card among Malaysia bank customers, as their new way in conducting payment tran...

Journal: CoRR

Volume: abs/1610.03996

Publication date: 2016
